ABSTRACT

A comparison is drawn between norm-referenced (or competitive) and
criterion-referenced testing procedures as used for the purpose of
assessing and grading knowledge gained from teaching. It is argued that
competitive or norm-referenced grading practices are sadistic, unethical,
statistically unsound, and irrelevant to course objectives.
Criterion-referenced procedures are advocated as alternatives which avoid
the problems of norm-referenced testing.

Periodically, it pays educators to question their most traditional practices
in the hope that such questioning will lead to improvements where needed, or
will at least force us to provide rationales for our practices and thus keep us
from the educational "mindlessness" which Charles E. Silberman convincingly
documents in Crisis in the Classroom. With this goal in mind, I would like to
present a point of view on one of the integral parts of teaching as we now know
it -- the practice of grading.

An analysis of grading practices is critical for at least two reasons: first,
because much of the way students spend their time learning in a given course is
determined by the grading procedure of the instructor; and second, because the
way in which grading is presently handled, probably by upwards of 90% of the
faculty at every educational institution in the country, is atrocious.

The traditional and most widely used grading system is one in which the
instructor evaluates the students by differentially ranking or grading them on
the basis of their differential performances, usually on a test or paper. Thus
it is not unusual for the instructor to give midterm and final examinations;
to combine the two scores for each student in some way to arrive at a final course
average; and, on the basis of the distribution of scores on this final average,
to decide which students should receive grades of A, B, C, or whatever. Such a
procedure is technically known as norm-referenced testing since any single
person's score is interpreted in terms of the scores of the other students -- the
norm. I prefer the term competitive testing for this procedure since one student
is forced to compete with another in order to be stamped with the instructor's
and society's seal of approval.

This article is based on a viewpoint expressed in the Reporter, State University
of New York at Buffalo, March 25, 1971, Vol. 2, No. 26, p. 4.

To be blunt about it, competitive grading practices are sadistic, unethical,
statistically unsound, and irrelevant to the course objectives. Competitive
testing is, first of all, sadistic because it pits one student against another,
setting up a situation in which cooperation among students is risky. It
encourages cheating whenever the probability of being caught is low (which is
usually always when the students are ingenious, which is also usually always),
and it probably contributes to the loss of library books and journal pages in
universities and to writing on the table tops and coded messages in the public
schools. Some people might prefer to think of these evidences of competition as
providing increased motivation to study hard. I am convinced that they increase
motivation to find ways to beat the system, but I doubt that they result in
increases in learning course concepts. Even if competitive testing did increase
motivation, it would still be sadistic because it is based on the assumption
that not everyone can or should succeed in the course -- that is, achieve the
agreed-upon objectives. Thus the instructor is telling each student who does not
receive the highest grade that he is not as good as the other students.

This is also where the ethics of the procedure enters, though it is more
apparent if you think of the effects of competitive testing at the elementary
school level. In a society in which the school's function is to provide each
child with basic skills necessary for him to select his own pursuit of happiness,
each student has a right to succeed in a course. In elementary school, society
expects each child to succeed in various skills such as the three R's,
cooperative play, etc. In college, students pay for the privilege of gaining
knowledge, skills, or degrees. At either end of the continuum the end result
should be the student's success in these goals. In a competitive system,
however, any time one student succeeds, at least one other student fails.
Teachers do not have the right to play God and decide who should and should not
be successes in life. Even were they competent to make such judgments, making
them is not part of their job, which is to help and encourage each student to
achieve the objectives of the course. More on this later.

The third point about competitive testing has to do with the grades of
students which are derived from the test scores. Distributions of test scores
have been the subject of a great amount of research by scholars in the discipline
of educational and psychological measurement. Many concepts could be invoked
from this field to support an argument that most tests constructed by instructors
have major flaws, one of the most serious being that they provide unreliable
measures. However, it is not necessary to argue the statistical unsoundness of
competitive grading practices on such esoteric points. Even if the instructor's
test were technically sound and perfectly reliable -- which no test is -- any
differential grading or ranking of students on the basis of the test scores would
still be an arbitrary process. It is arbitrary whether the scores are raw,
converted, normalized, standardized, or otherwised. It is arbitrary because at
almost any cutoff point the score which falls into the area receiving one grade
is not significantly different from the closest score falling into the area
receiving an adjacent grade. The finer the discriminations in grading, the more
arbitrary the process. Thus, if an instructor decides that a score of 90% correct
will earn an A, but 89% will earn a B, he has arbitrarily made this decision,
since in point of fact there is no reliable difference between a score of 89
and 90. In fact, for most tests there is probably no significant difference
between scores of 80 and 90.

In a statistical sense, scores which are not significantly different from
one another should be considered to be equivalent, and students who receive those
scores should be considered to have learned the material equally well. In case
we have some nonbelievers on this point, I invite any statistical expert to
describe the conditions under which adjacent scores receiving different grades
could be considered to be significantly different, and thus provide us with a
rational, non-arbitrary procedure for competitive grading. I predict that, in
practice, the distribution of scores necessary to sustain the obtained rational
procedure will never occur.
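
One way to make the point about adjacent scores concrete is the standard error
of measurement from classical test theory. The sketch below is an illustration
only, not a procedure given in this article: the score standard deviation of 10
points, the reliabilities of 0.90 and 0.80, the 1.96 multiplier for a 95 percent
band, and the function names are all assumed for the example.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def min_reliable_difference(sd, reliability, z=1.96):
    """Smallest gap between two scores that exceeds measurement error.

    The standard error of a difference between two scores on the same
    test is SEM * sqrt(2); z=1.96 gives an (assumed) 95 percent band.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return z * sem * math.sqrt(2.0)


def reliably_different(score_a, score_b, sd, reliability, z=1.96):
    """True only if the two scores differ by more than measurement error."""
    gap = abs(score_a - score_b)
    return gap > min_reliable_difference(sd, reliability, z)


# Assumed figures: a 100-point test with SD = 10.
for reliability in (0.90, 0.80):
    threshold = min_reliable_difference(10, reliability)
    print(f"reliability {reliability:.2f}: gap must exceed {threshold:.1f} points")
    print("  89 vs 90 distinguishable:", reliably_different(89, 90, 10, reliability))
    print("  80 vs 90 distinguishable:", reliably_different(80, 90, 10, reliability))
```

Under these assumed figures, 89 and 90 cannot be told apart at either
reliability, and 80 and 90 cannot be told apart at the lower one. Different
assumptions would move the threshold, but a one-point gap falls inside the
error band on any test whose reliability is less than perfect by more than a
trivial amount.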

All of the above reasons are sufficient in and of themselves to warrant the
immediate (and even retroactive, if it were possible) cessation of competitive
grading practices. Nevertheless, to add insult to injury, the teacher who engages
in such practices is not even doing his job: that is, the whole practice of
testing and grading competitively is irrelevant to the process of teaching.

Teaching is an activity which cannot be divorced from learning. To
paraphrase Dewey, we should laugh at a salesman who said he sold many items when
no one bought any. The analogy to learning is perfect: if any student has not
learned, the teacher has not taught him. It is irrelevant to that student if
other students have learned. We can say that the teacher was successful in the
case of those who learned, but we must also say that he failed in the case of
those who did not learn. Of what use, then, is a rank ordering of students
from best to worst in performance? What the instructor needs to determine is
which students have achieved the course objectives and which have not.
Those who have not, the instructor needs to teach so that they achieve
the objectives. Why else are faculty members being paid to teach?

For those who would argue that businesses, industry, and graduate schools
depend upon the class ranks of students, I argue that that is too bad. It is not
the teacher's job, but the selector's (employer or school), to devise ways to
select students for their positions. The sooner the schools get out of the
business of maintaining competitive permanent record files (which seem
indestructible and available to almost anyone but the student), the better.

Having presented my case against traditional grading procedures, I want
to be careful not to place myself on the side of those who have reacted so
strongly against the evils of these procedures as to have gone to the opposite
extreme -- the extreme of no assessment of learning. By the argument in the last
few paragraphs it can be seen that assessment of what has been learned is an
integral part of teaching. Without it, teaching cannot be claimed to have
occurred. Thus, some assessment of whether each student has attained the
objectives is necessary, although tests are not the only way of assessing this.
Too many, especially in some of the free school movements, have abandoned
assessment entirely. While this is a neat solution to the problems of sadism,
unethicality, and statistical unsoundness which have been raised, and often has
the added attraction of letting students participate in the establishment of
objectives, we cannot claim that teaching has occurred until assessment of the
learning reveals that the objectives have been met.

Some may argue that learning has occurred even if the objectives have not
been attained, and they are probably correct. However, learning is a continual
process and does not need a teacher. Thus, if you want to justify yourself as
a teacher, you must demonstrate that what you taught to a given student was
learned by him.

If neither the competitive assessment nor the nonassessment approach to
grading is appropriate to excellent teaching, what is? As I see it, the answer
lies in what is called criterion-referenced or mastery testing. The procedure
involved is so named because each student is evaluated in relation to the course
objectives taken as the criterion, and he must demonstrate that he has mastered
these objectives. This means that each student is judged solely in relation to
these pre-established criteria (which again could have been determined by the
student, the instructor, or both) and independently of the performance of any
other student.

It also means that until a student has learned the material, he is merely in the
process of learning. Students who take longer than others to learn should not
be stigmatized, but should be helped to learn. (Often it is possible to have
other students assist them since cooperation will not harm anyone's class rank.)
Individual differences in rate of learning, of course, will still exist, but
from the standpoint of the instructor they are unrelated to his purpose -- namely,
to have each student achieve the course objectives.
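
The contrast between the two procedures can be sketched in a few lines. This is
an illustration under assumed values, not a scheme proposed in this article:
the student names, the scores, the criterion of 80, and the rule that the top
20 percent receive an A are all hypothetical.

```python
def norm_referenced(scores, top_fraction=0.2):
    """Competitive grading: a grade depends on rank among classmates."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    return {name: ("A" if position < n_top else "below A")
            for position, name in enumerate(ranked)}


def criterion_referenced(scores, criterion=80):
    """Mastery grading: each student is compared only with the criterion."""
    return {name: ("objectives met" if score >= criterion else "still learning")
            for name, score in scores.items()}


# Hypothetical class of five students.
scores = {"Ann": 95, "Ben": 92, "Cal": 90, "Dee": 88, "Eli": 60}

print(norm_referenced(scores))
print(criterion_referenced(scores))

# After further study and a second attempt, Eli reaches the criterion.
# No other student's result changes under the criterion-referenced rule.
retake = dict(scores, Eli=85)
print(criterion_referenced(retake))
```

With these assumed scores, the competitive rule labels four of the five
students "below A" even though four of them have met the criterion, and a
student's label can change only at another student's expense. Under the
criterion-referenced rule, one student's success leaves every other student's
result untouched.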

The perceptive reader will have noted that criterion-referenced testing does
not eliminate the arbitrary process of grading, since you still have a pass-fail
cutoff and, wherever you set that cutoff, there is likely to be a nonsignificant
difference between the scores most adjacent to it. In this kind of a grading
system, however, there is nothing malicious about the arbitrary nature of the
cutoff. This is because each student gets other opportunities to demonstrate
that he has achieved the objectives and thus he is not stigmatized as inferior.

It is also much easier for the instructor to avoid being defensive about his
assessment technique, to admit to its arbitrariness and to his human frailties
in assessing some other person's knowledge, and to discuss the material with
the student to come to some mutual agreement as to whether further study would
be desirable.

The single most important criticism of criterion-referenced assessment
techniques is that you need to establish criteria or behavioral objectives for
the course. For many courses this is no problem, especially if the subject matter
is logically organized into sequences of material, each level of which is necessary
as a prerequisite for understanding the next level. The criticism is more cogent
for courses in which creative products or novel solutions to problems are the
goal. In these cases, by definition, you cannot state a strict behavioral objective
ahead of time. Now consider a class in which each student is pursuing a different
creative project as described. Usually the student (in consultation with the
instructor) establishes his goal -- what he expects to gain from the experience.
How, then, should each student's progress toward that criterion, or the product
that results, be evaluated?

The way it should most certainly not be evaluated is through some competitive
procedure, for the same reasons as given above, plus the additional reason that
there is no rational way to compare performances which have different objectives.
Should the student's work not be assessed? Non-assessment is reasonable only if
the instructor is willing to take no credit for guiding the student's thinking,
encouraging his interest, etc., in which case the instructor is superfluous to
the process. The only reasonable recourse, it seems to me, is for the instructor
and student to establish evaluative criteria as the project evolves. In practice
this would involve student-faculty conferences in which the instructor gives the
student feedback at various points in the special project. In this way the
assessment process becomes an integral part of the learning-creative process,
which is as it should be. Though the criteria are not stated in pre-established
behavioral terms, the process is still very much criterion-referenced or, if
you prefer, goal-directed with evaluative feedback from the instructor.

One final point. Many will dismiss the idea of criterion-referenced assessment
because of the greater amount of work it requires to develop such a system,
especially for large classes. Such procedures have been successfully developed,
but they do require more work, at least initially, than either no assessment
or competitive assessment techniques. However, no one ever claimed teaching to
be easy. More important, excellence in teaching will continue to be in the same
short supply as it is presently unless faculty members adopt some variations of
criterion-referenced assessment procedures.